DETAILED ACTION

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set forth in
37 CFR 1.17(e), was filed in this application after final rejection. Since this application is
eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e)
has been timely paid, the finality of the previous Office action has been withdrawn pursuant to
37 CFR 1.114. Applicant's submission filed on January 29, 2008, has been entered.



Claim Rejections - 35 USC § 101 

2. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or 
any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and 
requirements of this title. 

Claims 26-29 and 32-33 are rejected under 35 U.S.C. 101 because the claimed invention 
is directed to non-statutory subject matter. 

Claims 26-29 and 32-33 are directed to a computer program. Computer programs 
claimed as the description or expressions of the programs are not physical "things." They are 
neither computer components nor statutory processes, as they are not "acts" being performed. 
Such claimed computer programs do not define any structural and functional interrelationships 
between the computer program and other claimed elements of a computer, which permit the 
computer program's functionality to be realized. Since a computer program is merely a set of 
instructions capable of being executed by a computer, the computer program itself is not a 
process (see USPTO Interim Guidelines for Patent Subject Matter Eligibility) and the Office
treats a claim for a computer program, without the computer-readable medium needed to realize
the computer program's functionality, as nonstatutory functional descriptive material. When a 
computer program is claimed in a process where the computer is executing the computer 
program's instructions, the Office treats the claim as a process claim. When a computer program 
is recited in conjunction with a physical structure, such as a computer memory, the Office treats 
the claim as a product claim. 

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claims 1-3, 5-6, 10-16, and 35-36 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Mitchell (US Patent No. 5,799,273) in view of Ball et al (US Patent No. 
6,240,391). 

3. Regarding claims 1 and 35, Mitchell discloses a system for relating words in an audio file 
to words in a text file, comprising: retrieving a text file comprising a plurality of textual words 
(col. 6, lines 20-29); generating an audio file comprising a plurality of audible words based on 
the text file (col. 6, lines 9-19); storing information relating each audible word to a 
corresponding textual word (col. 6, lines 48-65); and an electronic marker that indicates the 
position of the audible word within the text file (abstract; col. 9, lines 13-25, in which Mitchell
discloses that the user can delete and/or insert text and the recognition interface updates the links
between the recognized word and the associated audio components such that link data is 
amended to indicate the correct character position of the word in the text). 
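For illustration only, the link update described above can be sketched as follows. This is a minimal sketch under stated assumptions, not Mitchell's implementation: the Link record, its fields, and the shift_links function are hypothetical names introduced here, and the sketch assumes each link stores the character position of the first character of its word.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical link record tying one recognized word to its audio."""
    word: str
    char_pos: int   # position of the word's first character in the text
    audio_id: int   # identifier of the associated audio component


def shift_links(links, edit_pos, delta):
    """Shift the character position of every word at or after edit_pos.

    delta is positive for inserted characters and negative for deleted ones.
    """
    for link in links:
        if link.char_pos >= edit_pos:
            link.char_pos += delta
    return links


text = "the cat sat"
links = [Link("the", 0, 1), Link("cat", 4, 2), Link("sat", 8, 3)]

# Insert "big " (4 characters) at character position 4, then amend the links
# so that each one again points at the first character of its word.
text = text[:4] + "big " + text[4:]
shift_links(links, edit_pos=4, delta=4)
```

After the insertion, the links for "cat" and "sat" are shifted by four characters, while the link for "the", which precedes the edit, is unchanged.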

Mitchell does not disclose that the electronic marker is within the audio file; however,
storing markers that link portions of audio and portions of text with the audio file was well 
known in the art and it would have been obvious to one of ordinary skill at the time of the 
invention to provide for the electronic marker embedded in the audio file that indicates the 
position of the audible word within the text file so as to aid the user in reviewing the text as the 
audio is output. 

Mitchell does not teach the audio file is generated by converting the textual words to a 
plurality of audible words, or that the audio file is transmitted or available to a user of a 
telecommunications device. Ball (col. 4, line 66 to col. 5, line 3; col. 6, lines 23-47; col. 6, line
51 to col. 7, line 6) discloses a system and method for assembling and presenting structured voice
mail messages, which implements a telephone/IP server for providing the functions of audio 
play and record, text-to-speech synthesis, dual-tone multi-frequency (DTMF) (touch-tone) 
recognition, automatic speech recognition (ASR) processing, and other call control functions 
necessary for interactive audio services, and specifically teaches the textual messaging elements 
of the structured voice mail message are converted to a speech signal by a text-to-speech 
processor, and combined with each other and audio fragments, converted from their data files. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the linked audio and text data for transmission to a user of a
telecommunications device, so as to generate structured voice mail messages and thereby provide
the voice mail recipient the ability to access any audio, text, video or multi-media type message.

Regarding claim 2, Mitchell discloses the textual words comprise ASCII text (col. 5, lines
59-67).

Regarding claim 3, Mitchell discloses the audio file is stored in the form of a WAV file 
(col. 6, lines 9-29; col. 13, lines 26-30). 

Regarding claim 5, Mitchell discloses the information comprises a file map relating a 
location of each textual word within the text file to a location of the corresponding audible word 
in the audio file (col. 6, line 48 to col. 8, line 3). 
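As an illustration of the kind of file map discussed above, the following minimal sketch relates the location of each textual word to the location of the corresponding audible word. It is a sketch under stated assumptions, not the structure disclosed in Mitchell: the build_file_map function, the dictionary keys, and the per-word durations in milliseconds are all hypothetical.

```python
import re


def build_file_map(text, word_durations_ms):
    """Return one entry per word relating its text and audio locations.

    text_pos is the character position of the word's first character.
    word_durations_ms is an assumed input giving the length of each
    audible word in milliseconds, in the same order as the words.
    """
    file_map = []
    audio_pos = 0
    for match, duration in zip(re.finditer(r"\S+", text), word_durations_ms):
        file_map.append({
            "word": match.group(),
            "text_pos": match.start(),
            "audio_start_ms": audio_pos,
            "audio_end_ms": audio_pos + duration,
        })
        audio_pos += duration
    return file_map


file_map = build_file_map("hello brave world", [400, 350, 500])
```

Each entry pairs a character offset in the text with a start and end offset in the audio, so either location can be found from the other.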

Regarding claims 6 and 36, Mitchell discloses the method steps are performed by logic
embodied in a computer readable medium (col. 4, line 66 to col. 5, line 36). 

Regarding claim 15, Mitchell discloses a system for relating words in an audio file to 
words in a text file, comprising: retrieving a text file comprising a textual word (col. 6, lines 20-
29); generating an audible word corresponding to the textual word (col. 6, lines 9-19); storing the
audible word in an audio file (col. 6, lines 9-29; col. 13, lines 26-30); storing a file map, the file map
comprising: a first location locating the audible word within the audio file (Figures 3-4; col. 6, line
48 to col. 7, line 30); and a second location locating the textual word within the text file (Figures 
3-4; col. 6, line 48 to col. 7, line 30). Mitchell discloses (col. 7, lines 9-10) the word positions 
are determined by determining the counter number indicating the position of the first character in 
the text for the word, which reads on initializing and incrementing the counter to identify textual 
words within the text file.

Mitchell does not teach the audio file is generated by converting the textual words to a
plurality of audible words, or that the audio file is transmitted or available to a user of a 
telecommunications device. Ball (col. 4, line 66 to col. 5, line 3; col. 6, lines 23-47; col. 6, line
51 to col. 7, line 6) discloses a system and method for assembling and presenting structured voice
mail messages, which implements a telephone/IP server for providing the functions of audio 
play and record, text-to-speech synthesis, dual-tone multi-frequency (DTMF) (touch-tone) 
recognition, automatic speech recognition (ASR) processing, and other call control functions 
necessary for interactive audio services, and specifically teaches the textual messaging elements 
of the structured voice mail message are converted to a speech signal by a text-to-speech
processor, and combined with each other and audio fragments, converted from their data files. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the linked audio and text data for transmission to a user of a
telecommunications device, so as to generate structured voice mail messages and thereby provide the
voice mail recipient the ability to access any audio, text, video or multi-media type message. 

Regarding claim 16, Mitchell discloses repeating the steps of the method for a plurality of textual
words in the text file (col. 5, line 59 to col. 8, line 3; Figures 3-4).

Regarding claim 10, Mitchell discloses a system for relating words in an audio file to 
words in a text file, comprising: retrieving a text file comprising a plurality of textual words (col. 
6, lines 20-29); generating an audible word corresponding to each textual word (col. 6, lines 9- 
29); playing the audible words to a user in real time as the audible words are generated (col.
8, line 52 to col. 10, line 2); and during the playing of the audible words, determining a current
textual word corresponding to the audible word currently being played (col. 8, line 52 to col. 10, line
2). Mitchell discloses (col. 7, lines 9-10) the word positions are determined by determining the
counter number indicating the position of the first character in the text for the word, which reads 
on initializing and incrementing the counter to identify textual words within the text file. 
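The step of determining the current textual word during playback can be illustrated with the following minimal sketch. It assumes a hypothetical file map sorted by audio start time; the FILE_MAP contents, the millisecond units, and the current_word function are assumptions introduced here and are not taken from Mitchell or from the claims.

```python
from bisect import bisect_right

# Hypothetical file map entries: (audio_start_ms, text_pos, word),
# sorted by the time at which each audible word starts playing.
FILE_MAP = [(0, 0, "hello"), (400, 6, "brave"), (750, 12, "world")]
STARTS = [entry[0] for entry in FILE_MAP]


def current_word(playback_ms):
    """Return (text_pos, word) for the audible word playing at playback_ms."""
    # Find the last entry whose start time is at or before playback_ms.
    index = bisect_right(STARTS, playback_ms) - 1
    if index < 0:
        raise ValueError("playback position precedes the first word")
    _, text_pos, word = FILE_MAP[index]
    return text_pos, word
```

A binary search over the start times is one possible design choice; it keeps the lookup fast enough to run repeatedly while the audio is being played.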

Mitchell does not teach the audio file is generated by converting the textual words to a 
plurality of audible words with each audible word comprising media stream packets, or that the 
audio file is transmitted or available to a user of a telecommunications device. Ball (col. 4, line 
66 to col. 5, line 3; col. 6, lines 23-47; col. 6, line 51 to col. 7, line 6) discloses a system and
method for assembling and presenting structured voice mail messages, which implements a 
telephone/IP server (and thus providing for the transmission of data that comprises media 
stream packets) for providing the functions of audio play and record, text-to-speech synthesis, 
dual-tone multi-frequency (DTMF) (touch-tone) recognition, automatic speech recognition 
(ASR) processing, and other call control functions necessary for interactive audio services, and 
specifically teaches the textual messaging elements of the structured voice mail message are 
converted to a speech signal by a text-to-speech processor, and combined with each other and 
audio fragments, converted from their data files. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the linked audio media stream packets and text data for
transmission to a user of a telecommunications device, so as to generate structured voice mail
messages and thereby provide the voice mail recipient the ability to access any audio, text, video or
multi-media type message that has been stored. 

Regarding claim 11, Mitchell discloses the textual words comprise ASCII text (col. 5, 
lines 59-67).

Regarding claim 13, Mitchell discloses storing information about the audible word, the
information comprising: an identifier for the textual word corresponding to the audible word (col.
6, line 48 to col. 8, line 3); and a time at which the audible word was played (col. 6, line 48 to 
col. 8, line 3; Figures 3-4). 

Regarding claim 14, Mitchell discloses the method steps are performed by logic
embodied in a computer readable medium (col. 4, line 66 to col. 5, line 36). 

5. Claims 7-8, 17, 30, and 32-33 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Mitchell and Ball in view of Dionne (US Patent No. 6,068,487) and further in view of 
Frulla et al (US Patent No. 6,424,357). 

6. Regarding claims 7, 17, 30 and 32, Mitchell discloses a system for relating words in an 
audio file to words in a text file, comprising: retrieving a text file comprising a textual word (col. 6,
lines 20-29); generating an audible word corresponding to the textual word (col. 6, lines 9-19);
storing the audible word in an audio file (col. 6, lines 9-29; col. 13, lines 26-30); storing a file map,
the file map comprising: a first location locating the audible word within the audio file (Figures 3-4;
col. 6, line 48 to col. 7, line 30); and a second location locating the textual word within the text 
file (Figures 3-4; col. 6, line 48 to col. 7, line 30). Mitchell discloses (col. 7, lines 9-10) the 
word positions are determined by determining the counter number indicating the position of the 
first character in the text for the word, which reads on initializing and incrementing the counter 
to identify textual words within the text file.

Mitchell does not teach the audio file is generated by converting the textual words to a
plurality of audible words, or that the audio file is transmitted or available to a user of a 
telecommunications device. Ball (col. 4, line 66 to col. 5, line 3; col. 6, lines 23-47; col. 6, line
51 to col. 7, line 6) discloses a system and method for assembling and presenting structured voice
mail messages, which implements a telephone/IP server for providing the functions of audio 
play and record, text-to-speech synthesis, dual-tone multi-frequency (DTMF) (touch-tone) 
recognition, automatic speech recognition (ASR) processing, and other call control functions 
necessary for interactive audio services, and specifically teaches the textual messaging elements 
of the structured voice mail message are converted to a speech signal by a text-to-speech
processor, and combined with each other and audio fragments, converted from their data files. 
It would have been obvious to one of ordinary skill at the time of the invention to modify the 
system of Mitchell to provide the linked audio and text data for transmission to a user of a
telecommunications device, so as to generate structured voice mail messages and thereby provide the
voice mail recipient the ability to access any audio, text, video or multi-media type message. 

Mitchell does not teach that the system identifies an audible word to be spelled in 
response to the command to spell; identifies a textual word in a text file corresponding to the 
audible word to be spelled; and audibly spell the textual word. Dionne teaches a method for 
having a reading machine spell a word, which includes retrieving a word to be spelled, 
displaying letters of the word, spelling the word and provides a text-to-speech output of the word 
(col. 3, lines 8-34). Dionne teaches that the system is useful in assisting individuals with 
learning disabilities or severe visual impairments.

It would have been obvious to one of ordinary skill at the time of the invention to modify
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors, for the purpose of assisting 
individuals with visual impairments with editing of text. 

Mitchell and Dionne do not teach the command input to the system is via a voice 
command. However, implementation of voice commands to allow for system functionality and 
control similar to that of hand-controlled input devices was well known in the art. 

Frulla discloses a voice input system that has a microphone coupled to a computing device,
with the computing device typically operating a computer software application. A user speaks 
voice commands into the microphone, with the computing device operating a voice command 
module that interprets the voice command and causes the graphical or non-graphical application 
to be commanded and controlled consistent with the use of a physical mouse (Figures 2-3; col. 
4, lines 57-64), and specifically teaches the system is advantageous in environments in which it 
is inconvenient or impractical to use a mouse, and thereby making the user interface more 
convenient and efficient for a user to input information and commands. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors and to further provide voice command 
control, as suggested by Frulla, for the purpose of making the user interface more convenient and 
efficient for a user to input information and commands in situations in which using a physical 
mouse is impractical or cumbersome.

Regarding claims 8 and 33, Mitchell discloses repeating the steps of the method for a plurality of
textual words in the text file (col. 5, line 59 to col. 8, line 3; Figures 3-4).

7. Claims 18-29 and 37 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Mitchell in view of Dionne (US Patent No. 6,068,487) and further in view of Frulla et al (US 
Patent No. 6,424,357). 

8. Regarding claims 18-29, and 37, Mitchell discloses a system which provides a user 
interface for relating words in an audio file to words in a text file, comprising: retrieving a text file
comprising a textual word (col. 6, lines 20-29); generating an audible word corresponding to the
textual word (col. 6, lines 9-19); storing the audible word in an audio file (col. 6, lines 9-29; col. 13,
lines 26-30); storing a file map, the file map comprising: a first location locating the audible word
within the audio file (Figures 3-4; col. 6, line 48 to col. 7, line 30); and a second location locating 
the textual word within the text file (Figures 3-4; col. 6, line 48 to col. 7, line 30). Mitchell 
discloses (col. 7, lines 9-10) the word positions are determined by determining the counter 
number indicating the position of the first character in the text for the word, which reads on 
initializing and incrementing the counter to identify textual words within the text file. 

Mitchell does not disclose that the electronic marker is within the audio file; however,
storing markers that link portions of audio and portions of text with the audio file was well 
known in the art and it would have been obvious to one of ordinary skill at the time of the 
invention to provide for the electronic marker embedded in the audio file that indicates the 
position of the audible word within the text file so as to aid the user in reviewing the text as the 
audio is output.

Mitchell does not teach that the system identifies an audible word to be spelled in
response to the command to spell; identifies a textual word in a text file corresponding to the 
audible word to be spelled; and audibly spell the textual word. Dionne teaches a method for 
having a reading machine spell a word, which includes retrieving a word to be spelled, 
displaying letters of the word, spelling the word and provides a text-to-speech output of the word 
(col. 3, lines 8-34). Dionne teaches that the system is useful in assisting individuals with 
learning disabilities or severe visual impairments. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors, for the purpose of assisting 
individuals with visual impairments with editing of text. 

Mitchell and Dionne do not teach the command input to the system is via a voice 
command. However, implementation of voice commands to allow for system functionality and 
control similar to that of hand-controlled input devices was well known in the art. 

Frulla discloses a voice input system that has a microphone coupled to a computing device,
with the computing device typically operating a computer software application. A user speaks 
voice commands into the microphone, with the computing device operating a voice command 
module that interprets the voice command and causes the graphical or non-graphical application 
to be commanded and controlled consistent with the use of a physical mouse (Figures 2-3; col. 
4, lines 57-64), and specifically teaches the system is advantageous in environments in which it 
is inconvenient or impractical to use a mouse, and thereby making the user interface more 
convenient and efficient for a user to input information and commands.

It would have been obvious to one of ordinary skill at the time of the invention to modify
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors and to further provide voice command 
control, as suggested by Frulla, for the purpose of making the user interface more convenient and 
efficient for a user to input information and commands in situations in which using a physical 
mouse is impractical or cumbersome. 

Response to Arguments 

9. Applicant's arguments filed January 29, 2008 have been fully considered but they are not 
persuasive. 

In response to applicant's argument that there is no suggestion to combine the references, 
the examiner recognizes that obviousness can only be established by combining or modifying the 
teachings of the prior art to produce the claimed invention where there is some teaching, 
suggestion, or motivation to do so found either in the references themselves or in the knowledge 
generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 
USPQ2d 1596 (Fed. Cir. 1988) and In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir.
1992). In this case, it would have been obvious to one of ordinary skill at the time of the 
invention to provide the linked audio media stream packets and text data for transmission to a 
user of a telecommunications device, so as to generate structured voice mail messages and thereby
provide the voice mail recipient the ability to access any audio, text, video or multi-media type 
message that has been stored.

Allowable Subject Matter

10. Claim 38 is objected to as being dependent upon a rejected base claim, but would be
allowable if rewritten in independent form including all of the limitations of the base claim and 
any intervening claims. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to ANGELA A. ARMSTRONG whose telephone number is 
(571)272-7598. The examiner can normally be reached on Monday-Thursday 11:30-8:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick N. Edouard can be reached on 571-272-7603. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/Angela A Armstrong/ 

Primary Examiner, Art Unit 2626 



